CUBE CONNECT Edition Help

Using Fratar

Fratar distribution is the process of modifying a matrix of values based upon a set of production and attraction factors for each of the zones in the matrix. The process is a relatively simple iterative one:

In the first iteration, each row in the matrix is factored according to its production factor. At the end of the iteration, the row totals will match the target row values, but the column totals will most likely not match their targets.

In the second iteration, each column in the modified matrix is factored according to its attraction factor. At the end of the iteration, the column totals will match the target column values, but the row totals may not match their targets.

This process continues for some number of iterations; the row and column totals should converge towards the target totals. When the criteria for convergence are met, the process is complete.
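The two alternating passes can be illustrated with a minimal sketch. It is written in Python rather than Cube script, purely for illustration; the function names, the 2-zone matrix, and the target values are all hypothetical, and the sketch is not the program's actual implementation.

```python
def row_pass(matrix, row_targets):
    """Scale each row so its total equals the row target (zero rows stay zero)."""
    result = []
    for row, target in zip(matrix, row_targets):
        total = sum(row)
        factor = target / total if total else 0.0
        result.append([value * factor for value in row])
    return result


def col_pass(matrix, col_targets):
    """Scale each column so its total equals the column target."""
    col_totals = [sum(col) for col in zip(*matrix)]
    factors = [t / s if s else 0.0 for t, s in zip(col_targets, col_totals)]
    return [[value * f for value, f in zip(row, factors)] for row in matrix]


# Hypothetical 2-zone input matrix and targets (both target sets total 150).
seed = [[10.0, 20.0], [30.0, 40.0]]
p_targets = [60.0, 90.0]
a_targets = [70.0, 80.0]

after_rows = row_pass(seed, p_targets)        # first iteration
after_cols = col_pass(after_rows, a_targets)  # second iteration

row_totals_1 = [sum(row) for row in after_rows]
col_totals_1 = [sum(col) for col in zip(*after_rows)]
col_totals_2 = [sum(col) for col in zip(*after_cols)]
row_totals_2 = [sum(row) for row in after_cols]
```

After the first pass the row totals equal the production targets while the column totals are off; after the second pass the column totals equal the attraction targets and the row totals drift slightly.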

A complete convergence (target row and column totals obtained for all zones) can be obtained only if the target grand control totals for rows and columns are the same. The program makes adjustments to guarantee that the target grand totals do match. It is possible that the user input values and specifications can preclude obtaining matching totals. In such cases, the program will fatally terminate.

Specifying target values

There are several typical ways in which the control totals can be specified: direct values, growth factors (explicit and implicit), or some combination of both. All specifications are via the SETPA control statements. An array of production values (Ps) and an array of attraction values (As) are maintained for each purpose. To simplify this description, the term "value" will be used to mean either direct values or growth factors. There must be a set of production values and attraction values for each matrix to be factored. They are input to the program via P, A, PGF, and AGF expressions. If direct values are to be input, the P and A expressions are used. If growth factors are to be input, the PGF and AGF expressions are used. Direct values and growth factors can be mixed for a purpose, but a complete understanding of the SETPA statement is necessary.

Each of the keyword expressions is computed for an array of values for all zones. P[1] = ZI.1.HBWP2000 would cause the program to simulate the expression:

JLOOP J=1,ZONES
     P[1][J] = ZI.1.HBWP2000[J]
ENDJLOOP

Similarly, A[]=, PGF[]=, and AGF[]= expressions are computed for corresponding arrays. To provide the capability of mixing P and PGF for a purpose, the SETPA statement may include the basic INCLUDE and EXCLUDE filter specifications. If either, or both, of these filters are specified on a SETPA statement, they apply to all expressions on that statement. To specify P and PGF for the same purpose, separate SETPA statements are used; each would have its own zonal filter set. If the sets overlap, the latest SETPA values replace any prior values. If the final value for a P or A is 0, the program revises it to a growth factor of 1.0.
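The filter and replacement rules can be sketched as follows. This is a Python illustration, not Cube script; the helper names, the zone data, and the dictionary layout are hypothetical. Two readings are assumptions: a zone that never receives a value is treated the same as a zone whose final value is 0, and negative results are skipped (a rule this section states further on).

```python
def passes_filter(zone, include=None, exclude=None):
    """Return True if the zone survives the INCLUDE/EXCLUDE filters."""
    if include is not None and zone not in include:
        return False
    if exclude is not None and zone in exclude:
        return False
    return True


def apply_setpa(store, kind, expression, zones, include=None, exclude=None):
    """Evaluate one expression for every zone J that passes the filters.

    store maps zone -> (kind, value), where kind is "direct" or "gf".
    Later calls overwrite earlier ones; negative results are not stored.
    """
    for j in range(1, zones + 1):
        if not passes_filter(j, include, exclude):
            continue
        value = expression(j)
        if value < 0:
            continue
        store[j] = (kind, value)
    return store


def finalize(store, zones):
    """Assumption: a missing or zero final value becomes a growth factor of 1.0."""
    for j in range(1, zones + 1):
        _, value = store.get(j, ("direct", 0))
        if value == 0:
            store[j] = ("gf", 1.0)
    return store


ZONES = 6
direct_p = {1: 120, 2: 0, 3: 80, 4: 60, 5: 0, 6: 0}            # hypothetical data
growth_p = {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.5, 5: 2.0, 6: -1.0}   # hypothetical data

p_store = {}
apply_setpa(p_store, "direct", lambda j: direct_p[j], ZONES, include=range(1, 5))
apply_setpa(p_store, "gf", lambda j: growth_p[j], ZONES, include=range(4, 7))
finalize(p_store, ZONES)
```

Zone 4 falls in both filter sets, so the later growth factor replaces the earlier direct value. Zone 2 (direct value 0) and zone 6 (negative result, never stored) both end up with a growth factor of 1.0.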

Example 1

SETPA P[1]=ZI.1.HBWP2000, A[1]=ZI.1.HBWA2000 INCLUDE=1-500
SETPA PGF[1]=ZI.2.EXTW/2 AGF[1]=PGF[1] INCLUDE=501-550

In this example, the values for zones 1-500 would be direct values taken from the ZI.1.HBWP2000 and ZI.1.HBWA2000 arrays, and the values for zones 501-550 would be growth factors computed from the ZI.2.EXTW array (divided by 2).

In most cases the values will be obtained from ZDATI (zonal data) files, or LOOKUP functions, but that is not an absolute requirement. Standard numerical expressions (J being the only viable variable that could be included) are used to compute the values. Sometimes, it is desirable to input specific values.

Example 2

SETPA P[1]=5000 INCLUDE=255 ;input a specific value
SETPA A[1]=sqrt(J/2+25**3.5) ;would be possible, but weird.

A special feature of these expressions is that if the result is less than zero, it is not stored.

After all SETPA P, A, PGF, and AGF expressions are processed, the program performs a zonal (I) loop, obtaining the matrix values for each purpose. The matrices are obtained by solving the SETPA MW[]= expressions. Again, the INCLUDE and EXCLUDE filters are employed, but care must be exercised if they are specified. The MW expressions use array notation, but are applied for each I zone. Therefore, the filters apply to both the I and J zones.

Example 3

SETPA MW[1]=... INCLUDE=1-500
; will compute only the I=1-500 to J=1-500 portion of the matrix.
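The effect of a filter on an MW expression can be pictured with a small Python sketch (not Cube script). The function name and the 3-zone matrix are hypothetical, and leaving unfiltered cells at 0 is an assumption about what "not computed" means.

```python
def compute_mw(source, include):
    """Keep cell (I, J) only when both I and J pass the filter; zones are 1-based."""
    zones = len(source)
    return [
        [source[i - 1][j - 1] if i in include and j in include else 0
         for j in range(1, zones + 1)]
        for i in range(1, zones + 1)
    ]


# Hypothetical 3-zone matrix; the filter keeps zones 1-2 only.
source = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
mw = compute_mw(source, include=range(1, 3))
```

Only the upper-left 2 by 2 block survives, because zone 3 is excluded both as an origin (I) and as a destination (J).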

Controlling target totals

After processing the input matrix, the target totals for any growth factor values can be fully determined (value = gf * input). Next, the program adjusts the values to ensure that the P and A totals match for each purpose. There are several options for adjustment; they are specified with the CONTROL keyword on the SETPA statement. There may be a CONTROL specification for each purpose, and if the CONTROL for any purpose is specified more than once, the latest value prevails. If no CONTROL is specified, it defaults to PA. The valid values for CONTROL are: P, A, PA, PL, AL, and PAL.

The meanings are:

  • P - The P totals control; all values in the A array will be factored so that the A totals will match the P totals.

  • A - The A totals control; all values in the P array will be factored so that the P totals will match the A totals.

  • PA - All values in both the P and A arrays will be factored so that their totals will match the average of the initial totals.
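The three basic options can be sketched in Python (an illustration only, not Cube script). The function names and numbers are hypothetical; the first helper applies the value = gf * input rule, and the second scales whichever array does not control.

```python
def targets_from_growth(growth_factors, input_totals):
    """Turn growth factors into target totals: value = gf * input."""
    return [gf * total for gf, total in zip(growth_factors, input_totals)]


def balance(p, a, control="PA"):
    """Factor the P and/or A arrays so that their grand totals match."""
    p_total, a_total = sum(p), sum(a)
    if control == "P":
        target = p_total                    # P totals control; A is factored
    elif control == "A":
        target = a_total                    # A totals control; P is factored
    elif control == "PA":
        target = (p_total + a_total) / 2    # both move to the average
    else:
        raise ValueError("this sketch handles only P, A, and PA")
    p_factor = target / p_total
    a_factor = target / a_total
    return [v * p_factor for v in p], [v * a_factor for v in a]


# Hypothetical inputs: P from growth factors, A as direct values.
p = targets_from_growth([1.5, 1.0], [200.0, 100.0])   # totals 400
a = [150.0, 350.0]                                    # totals 500

p_ctrl_p, a_ctrl_p = balance(p, a, "P")
p_ctrl_a, a_ctrl_a = balance(p, a, "A")
p_ctrl_pa, a_ctrl_pa = balance(p, a, "PA")
```

With these numbers, CONTROL=P brings the A total down to 400, CONTROL=A raises the P total to 500, and CONTROL=PA moves both to 450.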

Sometimes only certain zones are to be modified, and the remainder of the zones are to be kept constant. The program keeps track of the zones that are eligible for modification by noting which zones have target values that differ from the input value by more than one. If the letter "L" is appended to any of the CONTROL values, it indicates that the modifications are Limited to only the zones that have changed. Use of this feature can, in some cases, lead to a situation where a matrix grand total cannot be properly computed. If that is the case, the program will fatally terminate.

  • PL - The P totals control. The changed zones in the A array will be factored so that the final A total will match the P total.

  • AL - The A totals control. The changed zones in the P array will be factored so that the final P total will match the A total.

  • PAL - The values in the P array for zones that have P changes, and the values in the A array for zones that have A changes, will be factored in such a manner that the final P and A totals match the average of the initial P and A totals.
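A limited adjustment can be sketched as follows, again in Python for illustration only. The function name, the changed-zone flags, and the numbers are hypothetical, and the failure conditions shown are an assumption about when the grand total cannot be computed; the program's own checks may differ.

```python
def balance_limited(values, changed, target_total):
    """Factor only the changed zones so that the array total hits target_total."""
    fixed = sum(v for v, c in zip(values, changed) if not c)
    adjustable = sum(v for v, c in zip(values, changed) if c)
    needed = target_total - fixed
    if adjustable <= 0 or needed < 0:
        # Assumed analogue of the fatal termination described above.
        raise ValueError("target total cannot be reached by changing only the flagged zones")
    factor = needed / adjustable
    return [v * factor if c else v for v, c in zip(values, changed)]


# Hypothetical PL case: the P total (300) controls; only zones 2 and 3 of A changed.
a = [150.0, 250.0, 100.0]
changed_a = [False, True, True]
a_limited = balance_limited(a, changed_a, target_total=300.0)

# An impossible case: the unchanged zone alone already exceeds the target.
try:
    balance_limited(a, changed_a, target_total=100.0)
    failed = False
except ValueError:
    failed = True
```

Zone 1 keeps its value of 150, and zones 2 and 3 share the remaining 150 in proportion to their original values.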

It is impossible to modify any cell, column, or row of the input matrix that has zero to begin with. If a target value is specified for a zone that initially had no total, a warning message is issued. Traditionally, some modelers would scale a matrix by a value (usually 10 or 100) and then fill in all empty cells with one. This is not necessarily a good or bad solution, but because of the potential problems associated with this approach, zero accountability is not included in this program directly. If the scaling scheme is to be applied, a prior application of the Matrix program can be used to scale and fill in a matrix in any desired manner. It could also be achieved by setting the SETPA MW expression to: max(1,mi.n.n*10).
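The scale-and-fill idea behind max(1,mi.n.n*10) can be mirrored in a few lines of Python (illustration only; the function name and the sample matrix are hypothetical).

```python
def scale_and_fill(matrix, scale=10, floor=1):
    """Scale every cell, then raise anything below the floor up to the floor."""
    return [[max(floor, value * scale) for value in row] for row in matrix]


seed = [[0, 2], [0.05, 3]]       # hypothetical input with one empty cell
filled = scale_and_fill(seed)

# A zero cell stays zero no matter what factor is applied to it.
factored_zero = 0 * 3.7
```

Note that very small nonzero cells are lifted to the floor as well, which is one of the side effects to weigh before using this approach.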

Convergence — Iteration control

A concern is when to stop the iterating process; there are several ways to control it. The user can specify a maximum number of iterations, so that no matter how the convergence is progressing, the process will not exceed that number. After each iteration, the program computes an RMS error value based upon the integer differences between the computed and target row or column totals. After odd iterations, column total differences are checked, and after even iterations, row differences are checked. If the RMSE value is less than the MAXRMSE parameter value, the solution is achieved.
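The convergence test can be sketched in Python (illustration only, not the program's code). The function name and the sample totals are hypothetical, and rounding each difference to the nearest integer is an assumption about what "integer differences" means.

```python
import math


def integer_rmse(computed_totals, target_totals):
    """Root-mean-square of the differences, each rounded to an integer first."""
    diffs = [round(c - t) for c, t in zip(computed_totals, target_totals)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


MAXRMSE = 0.01                       # hypothetical parameter value

# After an odd (row) iteration, the column totals are the ones to check.
col_totals = [58.571, 91.429]        # hypothetical computed column totals
col_targets = [70.0, 80.0]
error = integer_rmse(col_totals, col_targets)
converged = error < MAXRMSE
```

Under this reading, any set of totals that rounds to the targets gives an RMSE of exactly zero, so a small MAXRMSE effectively asks for every total to match to the nearest integer.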

It is believed that this process will eventually reach convergence. But if, due to some unforeseen conditions, the RMSE value begins to oscillate, the program detects the oscillation, and terminates the process at the minimum RMSE. If there are multiple matrices being factored, they may reach optimum solutions at different times. If this happens, the "solved" matrices are held steady, and the others continue to be processed.

A small example of this process:

As dictated by row factoring, the row totals are correct. But, the column totals do not quite match the target. Another iteration is performed, and the results appear as:

The column totals are now on target, but the row totals are not quite on target.

This process goes on, back and forth, until either the RMSE drops to the MAXRMSE level, or the number of iterations reaches the MAXITERS value. In this example, the final solution is reached after 5 iterations (MAXRMSE=0.01 and MAXITERS=10).

All values are shown to the nearest integer and thus may not total exactly. Internally, the values are carried with more precision.
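The back-and-forth loop described above can be sketched end to end in Python (illustration only). The 2-zone matrix and targets are hypothetical and are not the numbers of the example above; the integer-rounded RMSE is an assumed reading of the convergence test, and oscillation detection is left out.

```python
import math


def fratar(matrix, p_targets, a_targets, max_rmse=0.01, max_iters=10):
    """Alternate row and column factoring until the RMSE test or max_iters stops it."""
    current = [row[:] for row in matrix]
    iteration = 0
    for iteration in range(1, max_iters + 1):
        if iteration % 2 == 1:
            # Odd iteration: factor rows, then check the column totals.
            current = [
                [v * (t / sum(row)) if sum(row) else 0.0 for v in row]
                for row, t in zip(current, p_targets)
            ]
            totals, targets = [sum(c) for c in zip(*current)], a_targets
        else:
            # Even iteration: factor columns, then check the row totals.
            col_totals = [sum(c) for c in zip(*current)]
            factors = [t / s if s else 0.0 for t, s in zip(a_targets, col_totals)]
            current = [[v * f for v, f in zip(row, factors)] for row in current]
            totals, targets = [sum(row) for row in current], p_targets
        diffs = [round(x - t) for x, t in zip(totals, targets)]
        rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
        if rmse < max_rmse:
            break
    return current, iteration


seed = [[10.0, 20.0], [30.0, 40.0]]      # hypothetical input matrix
result, iterations = fratar(seed, [60.0, 90.0], [70.0, 80.0])
rounded_rows = [round(sum(row)) for row in result]
rounded_cols = [round(sum(col)) for col in zip(*result)]
```

Because the test works on rounded differences, the loop can stop while the unrounded totals are still a fraction of a trip away from their targets.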

Multiple purposes

The number of purposes is determined by the highest P, A, PGF, AGF, or MW index found on any SETPA control statement. The program assumes that there will be purposes from one, monotonically, through that highest index. (FRATAR allows up to 20 trip purposes.) The distribution is performed prior to entering the main Matrix program I-loop. When the main I-loop is processed, MW[1] through MW[highest purpose] are initialized with the final matrices from the Fratar distribution. After the factoring process is complete, a standard Matrix program I-loop is performed.